Stochastic boosting algorithms

نویسندگان

  • Ajay Jasra
  • Christopher C. Holmes
چکیده

In this article, we discuss a class of stochastic boosting algorithms, which corrects and develops the work of [23], showing how to perform statistical inference in a computationally efficient manner. Sequential Monte Carlo (SMC) methods are used to illustrate that the stochastic boosting methods can provide better predictions, for a higher computational cost, than the corresponding boosting algorithm. A theoretical result is also given, which expresses an upper-bound of the posterior-predictive test error, in terms of that of boosting. The result shows that the averaged predictions used, are relatively stable with respect to boosting, when the latter provides the single best prediction. We also investigate the method on a real case study from machine learning and in a regression context, showing that it can be a useful tool for data exploration.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

ada: An R Package for Stochastic Boosting

Boosting is an iterative algorithm that combines simple classification rules with ‘mediocre’ performance in terms of misclassification error rate to produce a highly accurate classification rule. Stochastic gradient boosting provides an enhancement which incorporates a random mechanism at each boosting step showing an improvement in performance and speed in generating the ensemble. ada is an R ...

متن کامل

On Adaptive Regularization Methods in Boosting

Boosting algorithms build models on dictionaries of learners constructed from the data, where a coefficient in this model relates to the contribution of a particular learner relative to the other learners in the dictionary. Regularization for these models is currently implemented by iteratively applying a simple local tolerance parameter, which scales each coefficient towards zero. Stochastic e...

متن کامل

Multi-class Image Classification Based on Fast Stochastic Gradient Boosting

Nowadays, image classification is one of the hottest and most difficult research domains. It involves two aspects of problem. One is image feature representation and coding, the other is the usage of classifier. For better accuracy and running efficiency of high dimension characteristics circumstance in image classification, this paper proposes a novel framework for multi-class image classifica...

متن کامل

Evaluation of Data Mining Algorithms for Detection of Liver Disease

Background and Aim: The liver, as one of the largest internal organs in the body, is responsible for many vital functions including purifying and purifying blood, regulating the body's hormones, preserving glucose, and the body. Therefore, disruptions in the functioning of these problems will sometimes be irreparable. Early prediction of these diseases will help their early and effective treatm...

متن کامل

Stochastic Regularization and Boosting for Bioinformatics

Boosting techniques are extremely popular in many domains where one wishes to learn effective classification functions and identify the most relevant variables at the same time. However, in general, these techniques do not perform well on bioinformatic data sets. One known reason that makes bioinformatics data sets particularly problematic for Boosting, is that they typically contain very few t...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:
  • Statistics and Computing

دوره 21  شماره 

صفحات  -

تاریخ انتشار 2011